Approximate inference using conditional entropy decompositions

Authors

  • Amir Globerson
  • Tommi S. Jaakkola
Abstract

We introduce a novel method for estimating the partition function and marginals of distributions defined using graphical models. The method uses the entropy chain rule to obtain an upper bound on the entropy of a distribution given marginal distributions of variable subsets. The structure of the bound is determined by a permutation, or elimination order, of the model variables. Optimizing this bound results in an upper bound on the log partition function, and also yields an approximation to the model marginals. The optimization problem is convex, and is in fact a dual of a geometric program. We evaluate the method on a 2D Ising model with a wide range of parameters, and show that it compares favorably with previous methods in terms of both partition function bound, and accuracy of marginals.

Introduction

Graphical models are a powerful tool for representing multivariate distributions, and have been used with considerable success in numerous domains from coding algorithms to image processing. Although graphical models yield compact representations of distributions, it is often very difficult to infer simple properties of these distributions, such as the marginals over single variables, or the MAP assignment. This difficulty stems from the fact that these problems involve enumeration over an exponential number of assignments, and has motivated extensive research into approximate inference algorithms. Another problem, which turns out to have a key role in developing inference algorithms, is the calculation of the partition function. Recent works (Wainwright & Jordan, 2003; Yedidia et al., 2005) have illustrated that a variational view of partition function estimation can be used to analyze most of the previously introduced approximate inference algorithms, such as mean field, belief propagation (BP) and the tree re-weighting (TRW) framework (Wainwright et al., 2005).
The above analyses emphasize that a key ingredient in most approximate inference algorithms is the estimation of the entropy of a graphical model given marginals over subsets of its variables. This approximation may be an upper bound on the true entropy, as in the TRW framework, or one which is not guaranteed to be a bound, as in the Kikuchi entropies used in Generalized Belief Propagation (GBP) (Yedidia et al., 2005). Another important property of entropy approximations is convexity. The TRW entropies are convex, whereas those of GBP are not necessarily convex.

In the current work, we introduce a novel upper bound on graphical model entropy, which results in a convex upper bound on the partition function. The bound is constructed by decomposing the full model entropy into a sum of conditional entropies using the entropy chain rule (Cover & Thomas, 1991), and then discarding some of the conditioning variables, thus potentially increasing the entropy. This entropy bound is then plugged into the variational formulation, resulting in a convex optimization problem that yields an upper bound on the partition function. As with previous methods (Yedidia et al., 2005; Wainwright et al., 2005), a byproduct of this optimization problem is a set of pseudo-marginals which can be used to approximate the true model marginals. We evaluate our Conditional Entropy Decomposition (CED) method on a two-dimensional Ising grid, and show that it performs well for a wide range of parameters, improving on both TRW and belief propagation.

1 Definitions and Notation

We shall be interested in multivariate distributions over a set of variables x = {x1, . . . , xn}. Consider a set C of subsets C ⊆ {1, . . . , n}. Denote by xC an assignment to the variables xi such that i ∈ C. A distribution over x will be parameterized using functions θ(xC). We denote by θ the vector of all parameters for C ∈ C. These can be used to define an exponential distribution over x given by

p(x; θ) = (1/Z(θ)) exp( Σ_{C∈C} θ(xC) ),

where the normalization constant Z(θ) is the partition function.
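The chain-rule construction behind CED can be illustrated numerically. The following is a minimal sketch, not the paper's implementation: for a small joint distribution over three binary variables, the exact entropy decomposes via the chain rule as H(x1, x2, x3) = H(x1) + H(x2|x1) + H(x3|x1, x2), and dropping x1 from the last conditioning set can only increase the sum, since conditioning never increases entropy. The resulting sum is an upper bound on the true entropy.

```python
import numpy as np

# Toy joint distribution over three binary variables (x1, x2, x3).
rng = np.random.default_rng(0)
p = rng.random((2, 2, 2))
p /= p.sum()

def entropy(q):
    """Shannon entropy (in nats) of an array of probabilities."""
    q = q[q > 0]
    return float(-(q * np.log(q)).sum())

# Exact entropy via the chain rule:
# H(x1,x2,x3) = H(x1) + H(x2|x1) + H(x3|x1,x2).
H_joint = entropy(p)

p1 = p.sum(axis=(1, 2))                # marginal of x1
p12 = p.sum(axis=2)                    # marginal of (x1, x2)
H1 = entropy(p1)
H2_given_1 = entropy(p12) - H1         # H(x2|x1) = H(x1,x2) - H(x1)
H3_given_12 = H_joint - entropy(p12)   # H(x3|x1,x2) = H(x1,x2,x3) - H(x1,x2)

assert abs(H_joint - (H1 + H2_given_1 + H3_given_12)) < 1e-12

# CED-style relaxation: drop x1 from the conditioning set of x3.
# Conditioning cannot increase entropy, so H(x3|x1,x2) <= H(x3|x2),
# and the relaxed sum upper-bounds the true entropy.
p23 = p.sum(axis=0)                    # marginal of (x2, x3)
p2 = p.sum(axis=(0, 2))                # marginal of x2
H3_given_2 = entropy(p23) - entropy(p2)

H_bound = H1 + H2_given_1 + H3_given_2
assert H_bound >= H_joint - 1e-12
print(f"exact entropy: {H_joint:.4f}  CED-style bound: {H_bound:.4f}")
```

Note that the relaxed terms depend only on small marginals (here, pairwise), which is what makes the bound usable inside the variational formulation where only pseudo-marginals over subsets are available.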

Similar resources

Approximate Inference in Graphical Models using Tensor Decompositions

We demonstrate that tensor decompositions can be used to transform graphical models into structurally simpler graphical models that approximate the same joint probability distribution. In this way, standard inference algorithms, such as the junction tree algorithm, can then be applied to the transformed graphical model for approximate inference. The usefulness of the technique is demonstrat...


A Preferred Definition of Conditional Rényi Entropy

The Rényi entropy is a generalization of Shannon entropy to a one-parameter family of entropies. Tsallis entropy too is a generalization of Shannon entropy. The measure for Tsallis entropy is non-logarithmic. After the introduction of Shannon entropy, the conditional Shannon entropy was derived and its properties became known. Also, for Tsallis entropy, the conditional entropy was introduced a...


Tsallis Entropy and Conditional Tsallis Entropy of Fuzzy Partitions

The purpose of this study is to define the concepts of Tsallis entropy and conditional Tsallis entropy of fuzzy partitions and to obtain some results concerning this kind of entropy. We show that the Tsallis entropy of fuzzy partitions has the subadditivity and concavity properties. We study this information measure under the refinement and zero mode subset relations. We check the chain rules for ...


Approximate Entropy Reducts

We use an information entropy measure to extend the rough set based notion of a reduct. We introduce the Approximate Entropy Reduction Principle (AERP). It states that any simplification (reduction of attributes) in the decision model, which approximately preserves its conditional entropy (the measure of inconsistency of defining the decision by conditional attributes) should be performed to decrease ...


Bounding the Partition Function using Hölder's Inequality

We describe an algorithm for approximate inference in graphical models based on Hölder’s inequality that provides upper and lower bounds on common summation problems such as computing the partition function or probability of evidence in a graphical model. Our algorithm unifies and extends several existing approaches, including variable elimination techniques such as minibucket elimination and v...



Publication date: 2007